Cheating just three times massively ups the chance of winning at chess
It isn't always easy to detect cheating in chess. Just three judiciously deployed cheats can turn an otherwise equal chess game into a near-certain victory, a new analysis shows - and systems designed to crack down on cheating might not notice the foul play. Daniel Keren at the University of Haifa in Israel simulated 100,000 matches using the powerful Stockfish chess engine - a computer system that, at its maximum power, plays chess better than any human world champion. The matches were played between two computer engines competing at the level of an average chess player - 1500 on the Elo rating scale typically used to measure skill in chess. Half the games were logged without any further intervention, while the other half allowed occasional intervention by a stronger computer chess "player" with an Elo score of 3190 - a higher rating than any human player has ever achieved. Competitors usually have a slim advantage when playing white, with a 51 per cent chance of winning on average, tied to the fact that they make the game's first move.
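The Elo gap in this setup maps directly to a win expectancy. As a quick illustration using the standard Elo expected-score formula (not the study's own code):

```python
def elo_expected_score(rating_a, rating_b):
    """Expected score (win=1, draw=0.5, loss=0) for player A under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

# Two equal 1500-rated engines: each expects half a point per game.
print(round(elo_expected_score(1500, 1500), 2))   # 0.5

# A 3190-rated engine against a 1500-rated one: a near-certain win,
# which is why even a few moves of such "help" decide the game.
print(round(elo_expected_score(3190, 1500), 4))   # 0.9999
```

This is why occasional interventions by the 3190-rated engine are so decisive: each intervened move is effectively chosen by a player the weaker engine has almost no chance of outplaying.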
- Asia > Middle East > Israel > Haifa District > Haifa (0.25)
- Europe > Germany > Rheinland-Pfalz > Mainz (0.05)
- Asia > China (0.05)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Games > Chess (0.73)
- North America > Canada (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > China > Sichuan Province > Chengdu (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.95)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
06d5ae105ea1bea4d800bc96491876e9-AuthorFeedback.pdf
We thank all the reviewers for the constructive comments. We address the major concerns below. Reproducibility: 1) learning-to-draft details; 2) feature details; 3) discussion of the computing resources used. The search tree is updated based on four steps of MCTS. The learning rate is set to 0.001 with Adam.
Finding Friend and Foe in Multi-Agent Games
AI for multi-agent games like Go, Poker, and Dota has seen great strides in recent years. Yet none of these games addresses the real-life challenge of cooperation in the presence of unknown and uncertain teammates. This challenge is a key game mechanism in hidden role games. Here we develop the DeepRole algorithm, a multi-agent reinforcement learning agent that we test on The Resistance: Avalon, the most popular hidden role game. DeepRole combines counterfactual regret minimization (CFR) with deep value networks trained through self-play.
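The inner loop of CFR is regret matching: play each action in proportion to its accumulated positive regret. A minimal self-contained sketch on rock-paper-scissors - illustrative only, not DeepRole's implementation, which applies counterfactual regrets over game-tree information sets:

```python
import random

def regret_matching(cumulative_regret):
    """Current strategy: positive regrets normalized; uniform if none are positive."""
    positives = [max(r, 0.0) for r in cumulative_regret]
    total = sum(positives)
    n = len(cumulative_regret)
    if total <= 0:
        return [1.0 / n] * n
    return [p / total for p in positives]

# Row player's payoff for (row action, column action): Rock, Paper, Scissors.
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]

def train(iterations=20000, seed=0):
    """Self-play regret matching; the AVERAGE strategy approaches the
    uniform Nash equilibrium of rock-paper-scissors."""
    rng = random.Random(seed)
    regret = [0.0, 0.0, 0.0]
    strategy_sum = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        strat = regret_matching(regret)
        for i in range(3):
            strategy_sum[i] += strat[i]
        # Sample both players from the same (symmetric) strategy.
        a = rng.choices(range(3), weights=strat)[0]
        b = rng.choices(range(3), weights=strat)[0]
        # Regret: what action i would have earned vs. what we earned.
        for i in range(3):
            regret[i] += PAYOFF[i][b] - PAYOFF[a][b]
    total = sum(strategy_sum)
    return [s / total for s in strategy_sum]

print([round(p, 2) for p in train()])  # ≈ [0.33, 0.33, 0.33]
```

DeepRole's contribution is coupling this regret machinery with deep value networks so that the tree search can be truncated and evaluated by a learned function, rather than enumerated to terminal states.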
Playing the Player: A Heuristic Framework for Adaptive Poker AI
Paterson, Andrew, Sanders, Carl
For years, the discourse around poker AI has been dominated by the concept of solvers and the pursuit of unexploitable, machine-perfect play. This paper challenges that orthodoxy. It presents Patrick, an AI built on the contrary philosophy: that the path to victory lies not in being unexploitable, but in being maximally exploitative. Patrick's architecture is a purpose-built engine for understanding and attacking the flawed, psychological, and often irrational nature of human opponents. Through detailed analysis of its design, its novel prediction-anchored learning method, and its profitable performance in a 64,267-hand trial, this paper makes the case that the "solved" myth is a distraction from the real, far more interesting challenge: creating AI that can master the art of human imperfection.
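The exploitative philosophy can be made concrete with simple bluff arithmetic: a bluff profits exactly when the opponent folds more often than the pot odds demand. A sketch with hypothetical numbers - not Patrick's actual model:

```python
def bluff_ev(pot, bet, fold_freq):
    """Expected value of bluffing `bet` into `pot` against an opponent who
    folds with probability `fold_freq` (we win the pot) and otherwise
    calls with a better hand (we lose the bet)."""
    return fold_freq * pot - (1.0 - fold_freq) * bet

def breakeven_fold_freq(pot, bet):
    """Fold frequency at which the bluff breaks even: bet / (bet + pot)."""
    return bet / (bet + pot)

# A pot-sized bluff breaks even if the opponent folds half the time.
print(breakeven_fold_freq(100, 100))   # 0.5

# An exploitative agent bluffs exactly when it predicts over-folding:
# against a 70% folder, the same bluff is worth +40 chips on average.
print(bluff_ev(100, 100, 0.7))         # 40.0
```

An unexploitable solver would bluff at a fixed equilibrium frequency regardless of the opponent; the exploitative approach described in the paper instead conditions decisions like this on a predicted model of the specific human across the table.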
- Research Report > New Finding (0.67)
- Research Report > Experimental Study (0.46)
912d2b1c7b2826caf99687388d2e8f7c-AuthorFeedback.pdf
We thank all three reviewers for their comments and insightful suggestions. We outline some of these changes here. Our approach uses CFR instead of MCTS. We've added the following sentence: "Compared to . . ." Does the proposed method generalize to other games such as Werewolf or Saboteur? DeepRole could be applied directly to Saboteur. We mention in the discussion: "In future . . ." Need ablation and analysis - we all know trained agents are vulnerable to adversarial human players. Another interesting observation is that the bot does not need conversation.
A Appendix
Teleportation System is an exceptional strategy. Skill I can send teammates back to the spring, and Skill II can teleport teammates to Da Qiao's vicinity. We use the teleport ratio to evaluate the regional intention: Skill I and Skill II's teleport rates increase by 0.76 and 0.92, respectively. Baseline and MGG agents also each play 30 games against human players. [Table: Method vs. Experience, Money, Damage, Kill/Death/Assist; Player row begins 14573.92] Due to confidentiality agreements, we can't reveal any more. The core of the system is that teammates give more resources to the marksman in the early stage to quickly open the money gap with opponents.
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds
Tan, Weihao, Li, Xiangyang, Fang, Yunhao, Yao, Heyuan, Yan, Shi, Luo, Hao, Ao, Tenglong, Li, Huihui, Ren, Hongbin, Yi, Bairen, Qin, Yujia, An, Bo, Liu, Libin, Shi, Guang
We introduce Lumine, the first open recipe for developing generalist agents capable of completing hours-long complex missions in real time within challenging 3D open-world environments. Lumine adopts a human-like interaction paradigm that unifies perception, reasoning, and action in an end-to-end manner, powered by a vision-language model. It processes raw pixels at 5 Hz to produce precise 30 Hz keyboard-mouse actions and adaptively invokes reasoning only when necessary. Trained in Genshin Impact, Lumine successfully completes the entire five-hour Mondstadt main storyline on par with human-level efficiency and follows natural language instructions to perform a broad spectrum of tasks in both 3D open-world exploration and 2D GUI manipulation across collection, combat, puzzle-solving, and NPC interaction. In addition to its in-domain performance, Lumine demonstrates strong zero-shot cross-game generalization. Without any fine-tuning, it accomplishes 100-minute missions in Wuthering Waves and the full five-hour first chapter of Honkai: Star Rail. These promising results highlight Lumine's effectiveness across distinct worlds and interaction dynamics, marking a concrete step toward generalist agents in open-ended environments.
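The 5 Hz perception / 30 Hz action cadence amounts to refreshing a decision slowly while emitting low-level keyboard-mouse actions quickly. A schematic sketch of that control loop - an assumption about the mechanics, not Lumine's code:

```python
def run_loop(duration_s, perceive_hz=5, act_hz=30):
    """Simulate a control loop that refreshes its decision at perceive_hz
    but emits (possibly repeated) actions at act_hz, as in agents that
    decouple slow perception from fast low-level control."""
    perceptions, actions = 0, 0
    current_action = None
    ticks_per_decision = act_hz // perceive_hz  # 6 action ticks per frame
    for tick in range(int(duration_s * act_hz)):
        if tick % ticks_per_decision == 0:
            perceptions += 1            # process a new raw-pixel frame
            current_action = f"decision_{perceptions}"
        actions += 1                    # emit current_action at act_hz
    return perceptions, actions

# One simulated second: 5 perception steps drive 30 emitted actions.
print(run_loop(1.0))  # (5, 30)
```

Holding the last decision between perception steps keeps control latency low without requiring the vision-language model to run at the full action rate; the paper's adaptive reasoning goes further by invoking the expensive reasoning path only when necessary.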
- Research Report (1.00)
- Instructional Material > Course Syllabus & Notes (0.45)
- Leisure & Entertainment > Games > Computer Games (1.00)
- Information Technology > Software (0.93)
- Education (0.92)
Outbidding and Outbluffing Elite Humans: Mastering Liar's Poker via Self-Play and Reinforcement Learning
Dewey, Richard, Botyanszki, Janos, Moallemi, Ciamac C., Zheng, Andrew T.
AI researchers have long focused on poker-like games as a testbed for environments characterized by multi-player dynamics, imperfect information, and reasoning under uncertainty. While recent breakthroughs have matched elite human play at no-limit Texas hold'em, the multi-player dynamics are subdued: most hands converge quickly with only two players engaged through multiple rounds of betting. In this paper, we present Solly, the first AI agent to achieve elite human play in reduced-format Liar's Poker, a game characterized by extensive multi-player engagement. We trained Solly using self-play with a model-free, actor-critic, deep reinforcement learning algorithm. Solly played at an elite human level as measured by win rate (won over 50% of hands) and equity (money won) in heads-up and multi-player Liar's Poker. Solly also outperformed large language models (LLMs), including those with reasoning abilities, on the same metrics. Solly developed novel bidding strategies, randomized play effectively, and was not easily exploitable by world-class human players.
- North America > United States > Texas (0.24)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- (6 more...)
- Banking & Finance > Trading (1.00)
- Leisure & Entertainment > Games > Poker (0.48)
- Leisure & Entertainment > Games > Chess (0.46)
- Leisure & Entertainment > Games > Go (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- (2 more...)
Can They Dixit? Yes they Can! Dixit as a Playground for Multimodal Language Model Capabilities
Balepur, Nishant, Nguyen, Dang, Ki, Dayeon
Multi-modal large language models (MLMs) are often assessed on static, individual benchmarks - which cannot jointly assess MLM capabilities in a single task - or rely on human or model pairwise comparisons - which are highly subjective, expensive, and allow models to exploit superficial shortcuts (e.g., verbosity) to inflate their win-rates. To overcome these issues, we propose game-based evaluations to holistically assess MLM capabilities. Games require multiple abilities for players to win, are inherently competitive, and are governed by fixed, objective rules, making evaluation more engaging and providing a robust framework to address the aforementioned challenges. We manifest this evaluation specifically through Dixit, a fantasy card game where players must generate captions for a card that trick some, but not all, players into selecting the played card. Our quantitative experiments with five MLMs show Dixit win-rate rankings are perfectly correlated with those on popular MLM benchmarks, while games between human and MLM players in Dixit reveal several differences between agent strategies and areas of improvement for MLM reasoning.
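A claim that two rankings are "perfectly correlated" typically means a Spearman rank correlation of 1. A minimal sketch with hypothetical ranks for five models:

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation from two untied rank lists:
    rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    n = len(ranks_a)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

# Hypothetical ranks of five MLMs under Dixit vs. a static benchmark.
dixit_rank = [1, 2, 3, 4, 5]
bench_rank = [1, 2, 3, 4, 5]   # identical ordering -> "perfectly correlated"
print(spearman_rho(dixit_rank, bench_rank))               # 1.0

print(spearman_rho([1, 2, 3, 4, 5], [5, 4, 3, 2, 1]))     # -1.0 (fully reversed)
```

With only five models, agreement on every pairwise ordering is required to reach rho = 1, so the result is a strong (if small-sample) signal that the game-based ranking tracks the benchmark ranking.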
- North America > United States > Maryland (0.04)
- North America > Dominican Republic (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Asia > Middle East > Jordan (0.04)